Reviews: Meta-Learning MCMC Proposals

Neural Information Processing Systems

For example, one would expect some discussion of extracting motifs in the context of a concrete PPL. Also, the paper assumes that the motifs can be extracted at the beginning of inference (Algorithm 1). However, in PPLs the graph structure changes with different values of the latent variables, so the procedure does not appear directly applicable there.



Neural Information Processing Systems

This paper considers learning to sample from the posterior distribution of a model by directly predicting latent variables from data. The idea is tested in the block MCMC context, where a small block of latents is predicted from the current state of the other latents (and the data). This is shown to perform better than single-site Gibbs when variables are highly correlated and there is sufficient data to train the predictors. The paper is well written and has a reasonable evaluation. The comparison between block MCMC and single-site Gibbs is unsurprising.


Meta-Learning MCMC Proposals

Wang, Tongzhou, Wu, Yi, Moore, Dave, Russell, Stuart J.

Neural Information Processing Systems

Effective implementations of sampling-based probabilistic inference often require manually constructed, model-specific proposals. Inspired by recent progress in meta-learning for training learning agents that can generalize to unseen environments, we propose a meta-learning approach to building effective and generalizable MCMC proposals. We parametrize the proposal as a neural network to provide fast approximations to block Gibbs conditionals. The learned neural proposals generalize to occurrences of common structural motifs across different models, allowing for the construction of a library of learned inference primitives that can accelerate inference on unseen models with no model-specific training required. We explore several applications including open-universe Gaussian mixture models, in which our learned proposals outperform a hand-tuned sampler, and a real-world named entity recognition task, in which our sampler yields higher final F1 scores than classical single-site Gibbs sampling.
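The core mechanism described in the abstract, a neural network that approximates a block Gibbs conditional and is used as a proposal inside a Metropolis-Hastings step, can be sketched as follows. This is an illustrative toy example, not the authors' code: the 2D correlated Gaussian model and names such as `BlockProposal` and `log_joint` are assumptions made for the sketch.

```python
# Hedged sketch: a neural block proposal q(x_block | context) with an MH
# correction, on a toy 2D Gaussian with strongly correlated latents.
import torch
import torch.nn as nn

torch.manual_seed(0)

rho = 0.95
Sigma = torch.tensor([[1.0, rho], [rho, 1.0]])
Prec = torch.inverse(Sigma)

def log_joint(x):
    # Unnormalized log density of the toy model (assumed for this sketch).
    return -0.5 * (x @ Prec * x).sum(-1)

class BlockProposal(nn.Module):
    # Predicts a Gaussian over one latent given the other (the "context").
    def __init__(self, hidden=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(1, hidden), nn.Tanh(),
                                 nn.Linear(hidden, 2))

    def forward(self, context):
        mu, log_std = self.net(context).unbind(-1)
        return torch.distributions.Normal(mu, log_std.exp())

proposal = BlockProposal()
opt = torch.optim.Adam(proposal.parameters(), lr=1e-2)

# "Meta-training": maximize log q(x_block | context) on samples from the model,
# i.e. an amortized approximation of the block Gibbs conditional.
for _ in range(2000):
    x = torch.distributions.MultivariateNormal(torch.zeros(2), Sigma).sample((256,))
    q = proposal(x[:, 1:2])
    loss = -q.log_prob(x[:, 0]).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# MCMC with the learned proposal, alternating blocks, with MH acceptance.
x, accepts = torch.zeros(2), 0
with torch.no_grad():
    for t in range(5000):
        i, j = t % 2, 1 - t % 2          # block to update / its context
        q_fwd = proposal(x[j:j + 1])
        new_i = q_fwd.sample()
        x_new = x.clone(); x_new[i] = new_i
        q_rev = proposal(x_new[j:j + 1])  # equals q_fwd: context is unchanged
        log_alpha = (log_joint(x_new) - log_joint(x)
                     + q_rev.log_prob(x[i]) - q_fwd.log_prob(new_i))
        if torch.rand(()) < log_alpha.exp():
            x, accepts = x_new, accepts + 1
print("acceptance rate:", accepts / 5000)
```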


Neural Block Sampling

Wang, Tongzhou, Wu, Yi, Moore, David A., Russell, Stuart J.

arXiv.org Machine Learning

Efficient Monte Carlo inference often requires manual construction of model-specific proposals. We propose an approach to automated proposal construction by training neural networks to provide fast approximations to block Gibbs conditionals. The learned proposals generalize to occurrences of common structural motifs both within a given model and across different models, allowing for the construction of a library of learned inference primitives that can accelerate inference on unseen models with no model-specific training required. We explore several applications including open-universe Gaussian mixture models, in which our learned proposals outperform a hand-tuned sampler, and a real-world named entity recognition task, in which our sampler's ability to escape local modes yields higher final F1 scores than single-site Gibbs.
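The "library of learned inference primitives" idea can be illustrated with a small dispatch sketch. This is a hypothetical interface, not the paper's API: the `Motif` signature, the `register`/`propose` helpers, and the stand-in proposal function are assumptions; the point is only that proposals trained on a structural motif are keyed by that motif and reused when it recurs in an unseen model.

```python
# Hedged sketch: dispatching motif-keyed learned proposals on a new model.
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

@dataclass(frozen=True)
class Motif:
    # A structural signature: types of the block variables and of the
    # Markov-blanket variables the proposal conditions on.
    block_types: Tuple[str, ...]
    context_types: Tuple[str, ...]

# Each entry maps a motif to a trained proposal: context values -> block values.
ProposalFn = Callable[[Tuple[float, ...]], Tuple[float, ...]]
library: Dict[Motif, ProposalFn] = {}

def register(motif: Motif, proposal: ProposalFn) -> None:
    library[motif] = proposal

def propose(motif: Motif, context: Tuple[float, ...]) -> Tuple[float, ...]:
    # In practice one would fall back to a single-site proposal here.
    if motif in library:
        return library[motif](context)
    raise KeyError(f"no learned proposal for motif {motif}")

# Example: reuse a proposal trained on a "Gaussian child with two Gaussian
# parents" motif wherever that motif appears in an unseen model.
gauss_motif = Motif(block_types=("gaussian",),
                    context_types=("gaussian", "gaussian"))
register(gauss_motif, lambda ctx: (0.5 * (ctx[0] + ctx[1]),))  # stand-in for a trained network
print(propose(gauss_motif, (1.0, 3.0)))  # -> (2.0,)
```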